The amount of information exchanged per unit of time between two nodes in a dynamical network 
or between two data sets is a powerful concept for analysing complex systems. This quantity, 
known as the mutual information rate (MIR), is calculated from the mutual information, which 
is rigorously defined only for random systems. Moreover, the definition of mutual information is 
based on probabilities of significant events. This work offers a simple alternative way to calculate 
the MIR in dynamical (deterministic) networks or between two data sets (not fully deterministic), 
and to calculate its upper and lower bounds without having to calculate probabilities, but rather in 
terms of well known and well defined quantities in dynamical systems. As possible applications of 
our bounds, we study the relationship between synchronisation and the exchange of information in 
a system of two coupled maps and in experimental networks of coupled oscillators. 

I. INTRODUCTION 

Shannon's entropy quantifies information. It measures how much uncertainty
an observer has about an event being produced by a random system. Another
important concept in the theory of information is the mutual information [1].
It measures how much uncertainty an observer has about an event in a random
system X after observing an event in a random system Y (or vice versa).

Mutual information is an important quantity because it not only quantifies
linear and non-linear interdependencies between two systems or data sets, but
also measures how much information two systems exchange or two data sets
share. Because of these characteristics, it has become a fundamental quantity
for understanding the development and function of the brain, for
characterising and modelling complex or chaotic systems, and for quantifying
the information capacity of a communication system. When constructing a model
of a complex system, the first step is to understand which variables are the
most relevant to describe its behaviour. Mutual information provides a way to
identify those variables.

However, the calculation of mutual information in dynamical networks or data
sets faces three main difficulties [11-13]. First, mutual information is
rigorously defined only for random memoryless processes. Second, its
calculation involves probabilities of significant events and a suitable space
where probability is calculated. The events need to be significant in the
sense that they contain as much information about the system as possible. But
defining significant events, for example the fact that a variable has a value
within some particular interval, is a difficult task because the interval that
provides significant events is not always known. Finally, data sets have
finite size, which prevents one from calculating probabilities correctly. As a
consequence, mutual information can often be calculated only with a bias.

In this work, we show how to calculate the amount of information exchanged
per unit of time [Eq. (3)], the so-called mutual information rate (MIR),
between two arbitrary nodes (or groups of nodes) in a dynamical network or
between two data sets. Each node represents a d-dimensional dynamical system
with d state variables. The trajectory of the network considering all the
nodes in the full phase space is called the "attractor" and is represented by
$\Sigma$. We then propose an alternative method, similar to the ones proposed
in previous works, to calculate significant upper and lower bounds for the MIR
in dynamical networks or between two data sets, in terms of Lyapunov
exponents, expansion rates, and capacity dimension. These quantities can be
calculated without the use of probabilistic measures. As possible applications
of our bounds calculation, we describe the relationship between
synchronisation and the exchange of information in small experimental networks
of coupled Double-Scroll circuits.

In previous works, we proposed an upper bound for the MIR in terms of the
positive conditional Lyapunov exponents of the synchronisation manifold. As a
consequence, this upper bound could only be calculated in special complex
networks that allow the existence of complete synchronisation. In the present
work, the proposed upper bound can be calculated for any system (complex
networks and data sets) that admits the calculation of Lyapunov exponents.

We assume that an observer can measure only one scalar time series for each
one of two chosen nodes. These two time series are denoted by X and Y and they
form a bidimensional set $\Sigma_\Omega = (X, Y)$, a projection of the
"attractor" into a bidimensional space denoted by $\Omega$. To calculate the
MIR in higher-dimensional projections $\Omega$, see Supplementary Information.
Assume that the space $\Omega$ is coarse-grained in a square grid of $N^2$
boxes with equal sides $\epsilon$, so $N = 1/\epsilon$.

Mutual information is defined in the following way [1]. Given two random
variables, X and Y, each one produces events i and j with probabilities
$P_X(i)$ and $P_Y(j)$, respectively, and the joint probability between these
events is represented by $P_{XY}(i,j)$. Then, mutual information is defined as

$$I_S = H_X + H_Y - H_{XY}, \tag{1}$$

where $H_X = -\sum_i P_X(i) \log[P_X(i)]$, $H_Y = -\sum_j P_Y(j) \log[P_Y(j)]$,
and $H_{XY} = -\sum_{i,j} P_{XY}(i,j) \log[P_{XY}(i,j)]$. To simplify the
notation for the probabilities, we drop the subindexes X, Y, and XY, by making
$P_X(i) = P(i)$, $P_Y(j) = P(j)$, and $P_{XY}(i,j) = P(i,j)$. When using
Eq. (1) to calculate the mutual information between the dynamical variables X
and Y, the probabilities appearing in Eq. (1) are defined such that P(i) is
the probability of finding points in column i of the grid, P(j) the
probability of finding points in row j of the grid, and P(i,j) the probability
of finding points in the box where column i meets row j of the grid.
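
As a rough illustration of Eq. (1), the grid-based estimate can be sketched in Python as follows. This is a minimal sketch and not the authors' code: the function name `grid_mutual_information`, the assumption that both series are already normalised to [0, 1], and the use of natural logarithms (nats) are choices made only for this example.

```python
import math
from collections import Counter


def grid_mutual_information(xs, ys, n_boxes):
    """Estimate I_S = H_X + H_Y - H_XY (in nats) on an n_boxes x n_boxes grid.

    xs, ys: equally long sequences of values assumed to lie in [0, 1].
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def box(value):
        # Map a value in [0, 1] to a box index in 0 .. n_boxes - 1.
        return min(int(value * n_boxes), n_boxes - 1)

    cols = [box(x) for x in xs]   # column index i of each point
    rows = [box(y) for y in ys]   # row index j of each point
    total = len(xs)

    def entropy(counts):
        # Shannon entropy of the empirical probabilities count / total.
        return -sum((c / total) * math.log(c / total) for c in counts.values())

    h_x = entropy(Counter(cols))
    h_y = entropy(Counter(rows))
    h_xy = entropy(Counter(zip(cols, rows)))
    return h_x + h_y - h_xy
```

With identical series the estimate equals the entropy of one series, and with points spread evenly over all boxes of the grid it vanishes, which matches the interpretation of Eq. (1) given above.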

The MIR was first introduced by Shannon [1] as a "rate of actual
transmission" [16] and was later more rigorously redefined [13]. It represents
the mutual information exchanged between two correlated dynamical variables
per unit of time. To simplify the calculation of the MIR, the two continuous
dynamical variables are transformed into two discrete symbolic sequences X and
Y. Then, the MIR is defined by

$$\mathrm{MIR} = \lim_{n \to \infty} \frac{I_S(n)}{n}, \tag{2}$$

where $I_S(n)$ represents the usual mutual information between the two
sequences X and Y, calculated by considering words of length n.

The MIR is a fundamental quantity in science. Its maximal value gives the
information capacity between any two sources of information (with no need for
stationarity, statistical stability, or memorylessness) [19]. Therefore,
alternative approaches for its calculation, or for the calculation of bounds
for it, are of vital relevance. Because of the limit to infinity in Eq. (2)
and because it is defined from probabilities, the MIR is not easy to
calculate, especially if one wants to calculate it from (chaotic) trajectories
of a large complex network or from data sets. The difficulties faced in
estimating the MIR from dynamical systems and networks are similar to the ones
faced in the calculation of the Kolmogorov-Sinai entropy, $H_{KS}$ [20]
(Shannon's entropy per unit of time). Because of these difficulties, the upper
bound for $H_{KS}$ proposed by Ruelle [21] in terms of the Lyapunov exponents
and valid for smooth dynamical systems ($H_{KS} \leq \sum_i \lambda_i^+$,
where $\lambda_i^+$ represent all the positive Lyapunov exponents), and
Pesin's equality ($H_{KS} = \sum_i \lambda_i^+$), proved to be valid for the
large class of systems that possess an SRB measure, became so important in the
theory of dynamical systems. Our upper bound [Eq. (13)] is a result equivalent
to the work of Ruelle.



II. MAIN RESULTS 

One of the main results of this work (whose derivation can be seen in
Sec. III B) is to show that, in dynamical networks or data sets with fast
decay of correlation, $I_S$ in Eq. (1) represents the amount of mutual
information between X and Y produced within a special time interval T, where
T represents the time for the dynamical network (or data sets) to lose its
memory of the initial state, or for the correlation to decay to zero.
Correlation in this work is not the usual linear correlation, but a non-linear
correlation defined in terms of the evolution of spatial probabilities, the
quantity C(T) in Sec. III A. Therefore, the mutual information rate (MIR)
between the dynamical variables X and Y (or two data sets) can be estimated by

$$\mathrm{MIR} = \frac{I_S}{T}. \tag{3}$$



In systems that present sensitivity to initial conditions, e.g. chaotic
systems, predictions are only possible for times smaller than this time T.
This time has other meanings. It is the expected time necessary for a set of
points belonging to an $\epsilon$-square box in $\Omega$ to spread over
$\Sigma_\Omega$, and it is of the order of the shortest Poincaré return time
for a point to leave a box and return to it [24, 25]. It can be estimated by

$$T \approx \frac{1}{\lambda_1} \log\left[\frac{1}{\epsilon}\right], \tag{4}$$

where $\lambda_1$ is the largest positive Lyapunov exponent measured in
$\Sigma_\Omega$. Chaotic systems present the mixing property (see Sec. III A),
and as a consequence the correlation C(t) always decays to zero, surely after
an infinitely long time. The correlation of chaotic systems can also decay to
zero for a sufficiently large but finite t = T (see Supplementary
Information). T can be interpreted as the minimum time required for a system
to satisfy the conditions to be considered mixing. Some examples of physical
systems that are proved to be mixing and to have exponentially fast decay of
correlation are nonequilibrium steady states [26], Lorenz gases (models of
diffusive transport of light particles in a network of heavier particles)
[27], and billiards [28]. An example of a "real world" physical complex system
that presents exponentially fast decay of correlation is plasma turbulence
[29]. We do not expect that data coming from a "real world" complex system is
rigorously mixing and has an exponentially fast decay of correlation. But we
expect that the data has a sufficiently fast decay of correlation (e.g.
stretched exponential or polynomially fast decay), implying that the system
has sufficiently high sensitivity to initial conditions and, as a consequence,
$C(t) \approx 0$ for a reasonably small and finite time t = T.
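
To make Eqs. (3) and (4) concrete, the following hedged sketch computes T from the largest Lyapunov exponent and the box side, and then the MIR estimate. The function names `mixing_time` and `mir_estimate` are hypothetical, and the sketch assumes that `i_s` and `lambda_1` were obtained with the same logarithm base.

```python
import math


def mixing_time(lambda_1, epsilon):
    """Eq. (4): T = (1 / lambda_1) * log(1 / epsilon)."""
    if lambda_1 <= 0.0:
        raise ValueError("lambda_1 must be positive")
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return math.log(1.0 / epsilon) / lambda_1


def mir_estimate(i_s, lambda_1, epsilon):
    """Eq. (3): MIR = I_S / T, with T taken from Eq. (4)."""
    return i_s / mixing_time(lambda_1, epsilon)
```

For example, with $\lambda_1 = \ln 2$ and $\epsilon = 1/8$ the time T equals 3 iterations, the number of doublings needed for an interval of length 1/8 to reach length 1.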
The other two main results of our work are presented in Eqs. (5) and (7),
whose derivations are presented in Sec. III C. The upper bound for the MIR is
given by

$$I_C = \lambda_1 - \lambda_2 = \lambda_1 (2 - D), \tag{5}$$

where $\lambda_1$ and $\lambda_2$ (positive defined) represent the largest and
the second largest Lyapunov exponents measured in $\Sigma_\Omega$, if both
exponents are positive. If the i-th largest exponent is negative, then we set
$\lambda_i = 0$. If the set $\Sigma_\Omega$ represents a periodic orbit,
$I_C = 0$, and therefore there is no information being exchanged. The quantity
D is defined as

$$D = -\frac{\log(N_C(t = T))}{\log(\epsilon)}, \tag{6}$$

where $N_C(t = T)$ is the number of boxes that would be covered by fictitious
points at time T. At time t = 0, these fictitious points are confined in an
$\epsilon$-square box. They expand not only exponentially fast in both
directions according to the two positive Lyapunov exponents, but also form a
compact set, a set with no "holes". At t = T, they spread over
$\Sigma_\Omega$.

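The lower bound for the MIR is given by

$$I_C^l = \lambda_1 (2 - D_0), \tag{7}$$

where $D_0$ represents the capacity dimension of the set $\Sigma_\Omega$,

$$D_0 = \lim_{\epsilon \to 0} \left[ -\frac{\log(\tilde{N}_C(\epsilon))}{\log(\epsilon)} \right], \tag{8}$$

where $\tilde{N}_C$ represents the number of boxes in $\Omega$ that are
occupied by points of $\Sigma_\Omega$.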

D is defined in a way similar to the capacity dimension, though it is not the
capacity dimension. In fact, $D \leq D_0$, because $D_0$ measures the change
in the number of occupied boxes in $\Omega$ as the space resolution varies,
whereas D measures the relative number of boxes with a certain fixed
resolution $\epsilon$ that would be occupied by the fictitious points (in
$\Omega$) after being iterated for a time T. As a consequence, the empty space
in $\Omega$ that is not occupied by $\Sigma_\Omega$ does not contribute to the
calculation of $D_0$, whereas it contributes to the calculation of the
quantity D. In addition, $N_C \geq \tilde{N}_C$ (for any $\epsilon$), because
while the fictitious points form a compact set expanding with the same ratio
as the one for which the real points expand (ratio provided by the Lyapunov
exponents), the real set of points $\Sigma_\Omega$ might not occupy many
boxes.
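
The two bounds in Eqs. (5) and (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the points are assumed to lie in the unit square, and the limit in Eq. (8) is replaced by a box count at a single finite resolution, which is a crude stand-in for the true capacity dimension.

```python
import math


def upper_bound(lambda_1, lambda_2):
    """Eq. (5): I_C = lambda_1 - lambda_2, with negative exponents set to zero."""
    return max(lambda_1, 0.0) - max(lambda_2, 0.0)


def capacity_dimension_estimate(points, epsilon):
    """Finite-resolution stand-in for Eq. (8): -log(occupied boxes) / log(epsilon).

    points: iterable of (x, y) pairs assumed to lie in the unit square.
    """
    occupied = {(int(x / epsilon), int(y / epsilon)) for x, y in points}
    return -math.log(len(occupied)) / math.log(epsilon)


def lower_bound(lambda_1, d0):
    """Eq. (7): I_C^l = lambda_1 * (2 - D_0)."""
    return max(lambda_1, 0.0) * (2.0 - d0)
```

Points lying on the diagonal of the square give an estimate close to 1 and a lower bound close to $\lambda_1$, whereas points filling the square give an estimate close to 2 and a vanishing lower bound.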



III. METHODS 

A. Mixing, correlation decay and invariant 
measures 

Denote by $F^T(x)$ a mixing transformation that represents how a point
$x \in \Sigma_\Omega$ is mapped after a time T into $\Sigma_\Omega$, and let
$\rho(x)$ represent the probability of finding a point of $\Sigma_\Omega$ in x
(natural invariant density). Let $I_1'$ represent a region in $\Omega$. Then,
$\mu(I_1') = \int \rho(x)\,dx$, for $x \in I_1'$, represents the probability
measure of the region $I_1'$. Given two square boxes $I_1' \in \Omega$ and
$I_2' \in \Omega$, if $F^T$ is a mixing transformation, then for a
sufficiently large T the correlation
$C(T) = \mu[F^{-T}(I_1') \cap I_2'] - \mu[I_1']\mu[I_2']$ decays to zero: the
probability of having a point in $I_1'$ that is mapped to $I_2'$ is equal to
the probability of being in $I_1'$ times the probability of being in $I_2'$.
That is typically what happens in random processes.

If the measure $\mu(\Sigma_\Omega)$ is invariant, then
$\mu[F^{-T}(\Sigma_\Omega)] = \mu(\Sigma_\Omega)$. Mixing and ergodic systems
produce measures that are invariant.



B. Derivation of the mutual information rate 
(MIR) in dynamical networks and data sets 

We consider that the dynamical networks or data sets to be analysed either
present the mixing property or have fast decay of correlations, and that their
probability measure is time invariant. If a system that is mixing for a time 
interval T is observed (sampled) once every time interval 
T, then the probabilities generated by these snapshot 
observations behave as if they were independent, and the 
system behaves as if it were a random process. This 
is so because if a system is mixing for a time interval 
T, then the correlation C(T) decays to zero for this time 
interval. For systems that have some decay of correlation, 
surely the correlation decays to zero after an infinite time 
interval. But, this time interval can also be finite, as 
shown in Supplementary Information. 

Consider now that we have experimental points and they are sampled once every
time interval T. The probability $P_{XY}(i,j) \to P_{XY}(k,l)$ of the sampled
trajectory to follow a given itinerary, for example to fall in the box with
coordinates (i, j) and then be iterated to the box (k, l), depends exclusively
on the probabilities of being at the box (i, j), represented by
$P_{XY}(i,j)$, and being at the box (k, l), represented by $P_{XY}(k,l)$.
Therefore, for the sampled trajectory,
$P_{XY}(i,j) \to P_{XY}(k,l) = P_{XY}(i,j) P_{XY}(k,l)$. Analogously, the
probability $P_X(i) \to P_Y(j)$ of the sampled trajectory to fall in the
column (or line) i of the grid and then be iterated to the column (or line) j
is given by $P_X(i) \to P_Y(j) = P_X(i) P_Y(j)$.

The MIR of the experimental non-sampled trajectory points can be calculated
from the mutual information $I_S$ of the sampled trajectory points that follow
itineraries of length n:

$$\mathrm{MIR} = \lim_{n \to \infty} \frac{I_S(n)}{nT}. \tag{9}$$

Due to the absence of correlations of the sampled trajectory points, the
mutual information for these points following itineraries of length n can be
written as

$$I_S(n) = n[H_X(n=1) + H_Y(n=1) - H_{XY}(n=1)], \tag{10}$$

where $H_X(n=1) = -\sum_i P_X(i) \log[P_X(i)]$,
$H_Y(n=1) = -\sum_j P_Y(j) \log[P_Y(j)]$, and
$H_{XY}(n=1) = -\sum_{i,j} P_{XY}(i,j) \log[P_{XY}(i,j)]$, and $P_X(i)$,
$P_Y(j)$, and $P_{XY}(i,j)$ represent the probability of the sampled
trajectory points to fall in the line i of the grid, in the column j of the
grid, and in the box (i, j) of the grid, respectively.

Due to the time invariance of the set $\Sigma_\Omega$ assumed to exist, the
probability measure of the non-sampled trajectory is equal to the probability
measure of the sampled trajectory. If a system that has a time invariant
measure is observed (sampled) once every time interval T, the observed set has
the same natural invariant density and probability measure as the original
set. As a consequence, if $\Sigma_\Omega$ has a time invariant measure, the
probabilities P(i), P(j), and P(i,j) (used to calculate $I_S$) are equal to
$P_X(i)$, $P_Y(j)$, and $P_{XY}(i,j)$.

Consequently, $H_X(n=1) = H_X$, $H_Y(n=1) = H_Y$, and $H_{XY}(n=1) = H_{XY}$,
and therefore $I_S(n) = n I_S$. Substituting into Eq. (9), we finally arrive
at

$$\mathrm{MIR} = \frac{I_S}{T}, \tag{11}$$

where $I_S$ between two nodes is calculated from Eq. (1).

Therefore, in order to calculate the MIR, we need to estimate the time T for
which the correlation of the system approaches zero and the probabilities
P(i), P(j), and P(i,j) of the experimental non-sampled points to fall in the
line i of the grid, in the column j of the grid, and in the box (i, j) of the
grid, respectively.



C. Derivation of upper ($I_C$) and lower ($I_C^l$)
bounds for the MIR

Consider that our attractor $\Sigma$ is generated by a 2d expanding system
that possesses 2 positive Lyapunov exponents $\lambda_1$ and $\lambda_2$, with
$\lambda_1 \geq \lambda_2$, and $\Sigma \in \Omega$. Imagine a box whose sides
are oriented along the orthogonal basis used to calculate the Lyapunov
exponents. Then, points inside the box spread out after a time interval t to
$\epsilon\sqrt{2}\exp(\lambda_1 t)$ along the direction from which $\lambda_1$
is calculated. At t = T, $\epsilon\sqrt{2}\exp(\lambda_1 T) = L$, which
provides T in Eq. (4), since $L = \sqrt{2}$. These points spread after a time
interval t to $\epsilon\sqrt{2}\exp(\lambda_2 t)$ along the direction from
which $\lambda_2$ is calculated. After an interval of time t = T, these points
spread out over the set $\Sigma_\Omega$. We require that for $t \leq T$, the
distance between these points only increases: the system is expanding.

Imagine that at t = T, fictitious points initially in a square box occupy an
area of $\epsilon\sqrt{2}\exp(\lambda_2 T) L = 2\epsilon^2 \exp[(\lambda_2 + \lambda_1)T]$.
Then, the number of boxes of sides $\epsilon$ that contain fictitious points
can be calculated by
$N_C = 2\epsilon^2 \exp[(\lambda_1 + \lambda_2)T] / 2\epsilon^2 = \exp[(\lambda_1 + \lambda_2)T]$.
From Eq. (4), $N = \exp(\lambda_1 T)$, since $N = 1/\epsilon$.

We denote in lower case the probabilities p(i), p(j), and p(i,j) with which
fictitious points occupy the grid in $\Omega$. If these fictitious points
spread uniformly, forming a compact set whose probabilities of finding points
in each fictitious box are equal, then $p(i) = 1/N$, $p(j) = 1/N$, and
$p(i,j) = 1/N_C$. Let us denote the Shannon entropies of the probabilities
p(i), p(j), and p(i,j) as $h_X$, $h_Y$, and $h_{XY}$. The mutual information
of the fictitious trajectories after evolving a time interval T can be
calculated by $I_S^u = h_X + h_Y - h_{XY}$. Since $p(i) = p(j) = 1/N$ and
$p(i,j) = 1/N_C$, then $I_S^u = 2\log(N) - \log(N_C)$. At t = T, we have that
$N = \exp(\lambda_1 T)$ and $N_C = \exp[(\lambda_1 + \lambda_2)T]$, leading us
to $I_S^u = (\lambda_1 - \lambda_2)T$. Therefore, defining $I_C = I_S^u / T$,
we arrive at $I_C = \lambda_1 - \lambda_2$.
We define D as

$$D = -\frac{\log(N_C(t = T))}{\log(\epsilon)}, \tag{12}$$

where $N_C(t = T)$ is the number of boxes that would be covered by fictitious
points at time T. At time t = 0, these fictitious points are confined in an
$\epsilon$-square box. They expand not only exponentially fast in both
directions according to the two positive Lyapunov exponents, but also form a
compact set, a set with no "holes". At t = T, they spread over
$\Sigma_\Omega$.

Using $\epsilon = \exp(-\lambda_1 T)$ and
$N_C = \exp[(\lambda_1 + \lambda_2)T]$ in Eq. (12), we arrive at
$D = 1 + \frac{\lambda_2}{\lambda_1}$, and therefore we can write

$$I_C = \lambda_1 - \lambda_2 = \lambda_1 (2 - D). \tag{13}$$
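
The chain of substitutions leading to Eq. (13) can be checked numerically. The sketch below follows the steps of this section one by one for given exponents and box side; the function name `fictitious_bound` is hypothetical, and natural logarithms are assumed throughout.

```python
import math


def fictitious_bound(lambda_1, lambda_2, epsilon):
    """Follow Sec. III C step by step and return the pair (I_C, D)."""
    big_n = 1.0 / epsilon                           # N = 1/epsilon boxes per side
    t_mix = math.log(big_n) / lambda_1              # Eq. (4)
    n_c = math.exp((lambda_1 + lambda_2) * t_mix)   # boxes covered by fictitious points
    i_u = 2.0 * math.log(big_n) - math.log(n_c)     # h_X + h_Y - h_XY for uniform p
    d = -math.log(n_c) / math.log(epsilon)          # Eq. (12)
    return i_u / t_mix, d
```

For any positive exponents and any box side between 0 and 1, the returned values should agree with $\lambda_1 - \lambda_2$ and $1 + \lambda_2/\lambda_1$ up to rounding, so the result does not depend on the chosen resolution.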



To calculate the maximal possible MIR of a random independent process, we
assume that the expansion of points is uniform only along the columns and
lines of the grid defined in the space $\Omega$, i.e., P(i) = P(j) = 1/N
(which maximises $H_X$ and $H_Y$), and we allow P(i,j) to be non-uniform
(minimising $H_{XY}$) for all i and j. Then

$$I_S(\epsilon) = -2\log(\epsilon) + \sum_{i,j} P(i,j) \log[P(i,j)]. \tag{14}$$

Since $T(\epsilon) = -\frac{1}{\lambda_1} \log(\epsilon)$, dividing
$I_S(\epsilon)$ by $T(\epsilon)$, taking the limit $\epsilon \to 0$, and
recalling that the information dimension of the set $\Sigma_\Omega$ in the
space $\Omega$ is defined as
$D_1 = \lim_{\epsilon \to 0} \frac{\sum_{i,j} P(i,j) \log[P(i,j)]}{\log(\epsilon)}$,
we obtain that the MIR is given by

$$I_S / T = \lambda_1 (2 - D_1). \tag{15}$$

Since $D_1 \leq D_0$ (for any value of $\epsilon$), then
$\lambda_1 (2 - D_1) \geq \lambda_1 (2 - D_0)$, which means that a lower bound
for the maximal MIR [provided by Eq. (15)] is given by

$$I_C^l = \lambda_1 (2 - D_0). \tag{16}$$

But $D \leq D_0$ (for any value of $\epsilon$), and therefore $I_C$ is an
upper bound for $I_C^l$.

To show why $I_C$ is an upper bound for the maximal possible MIR, assume that
the real points $\Sigma_\Omega$ occupy the space $\Omega$ uniformly. If
$\tilde{N}_C > N$, there are many boxes being occupied. It is to be expected
that the probability of finding a point in a line or column of the grid is
$P(i) = P(j) \cong 1/N$, and $P(i,j) = 1/\tilde{N}_C$. In such a case,
$\mathrm{MIR} = I_C^l$, which implies that $I_C \geq \mathrm{MIR}$. If
$\tilde{N}_C < N$, there are only a few boxes being sparsely occupied. The
probability of finding a point in a line or column of the grid is
$P(i) = P(j) \cong 1/\tilde{N}_C$, and $P(i,j) = 1/\tilde{N}_C$. There are
$\tilde{N}_C$ lines and columns being occupied by points in the grid. In such
a case, $I_S = 2\log(\tilde{N}_C) - \log(\tilde{N}_C) \cong \log(\tilde{N}_C)$.
Comparing with $I_S^u = 2\log(N) - \log(N_C)$, and since $\tilde{N}_C < N$ and
$N_C \geq \tilde{N}_C$, we conclude that $I_S^u \geq I_S$, which implies that
$I_C \geq \mathrm{MIR}$.

Notice that if $P(i,j) = p(i,j) = 1/N_C$ and $D_1 = D_0$, then
$I_S / T = I_C^l = I_C$.



D. Expansion rates 

In order to extend our approach to the treatment of data sets coming from
networks whose equations of motion are unknown, or to higher-dimensional
networks and complex systems which might be neither rigorously chaotic nor
fully deterministic, or to experimental data that contains noise and few
sampling points, we write our bounds in terms of expansion rates, defined in
this work by

$$e_k(t) = \frac{1}{\tilde{N}_C} \sum_{i=1}^{\tilde{N}_C} \frac{1}{t} \log[L_k^i(t)], \tag{17}$$

where we consider k = 1, 2. $L_1^i(t)$ measures the largest growth rate of
nearby points. In practice, it is calculated by
$L_1^i(t) = \frac{\Delta}{\delta}$, with $\delta$ representing the largest
distance between pairs of points in an $\epsilon$-square box i and $\Delta$
representing the largest distance between pairs of the points that were
initially in the $\epsilon$-square box but have spread out for an interval of
time t. $L_2^i(t)$ measures how an area enclosing points grows. In practice,
it is calculated by $L_2^i(t) = \frac{A}{\epsilon^2}$, with $\epsilon^2$
representing the area occupied by points in an $\epsilon$-square box, and A
the area occupied by these points after spreading out for a time interval t.
There are $\tilde{N}_C$ boxes occupied by points which are taken into
consideration in the calculation of $L_k^i(t)$. An order-k expansion rate,
$e_k(t)$, measures on average how a hypercube of dimension k exponentially
grows after an interval of time t. So, $e_1$ measures the largest growth rate
of nearby points, a quantity closely related to the largest finite-time
Lyapunov exponent [30]. And $e_2$ measures how an area enclosing points grows,
a quantity closely related to the sum of the two largest positive Lyapunov
exponents. In terms of expansion rates, Eqs. (4) and (13) read
$T = \frac{1}{e_1} \log\left[\frac{1}{\epsilon}\right]$ and
$I_C = e_1 (2 - D)$, respectively, and Eqs. (12) and (16) read
$D(t) = \frac{e_2(t)}{e_1(t)}$ and $I_C^l = e_1 (2 - D_0)$, respectively.

From the way we have defined expansion rates, we expect that
$e_k \leq \sum_{i=1}^{k} \lambda_i$. Because of the finite time interval and
the finite size of the regions of points considered, regions of points that
present large derivatives, contributing largely to the Lyapunov exponents,
contribute less to the expansion rates. If a system has constant derivative
(hyperbolic) and has constant natural measure, then
$e_k = \sum_{i=1}^{k} \lambda_i$.

There are many reasons for using expansion rates in the way we have defined
them in order to calculate bounds for the MIR. Firstly, they can be easily
estimated experimentally, whereas Lyapunov exponents demand huge computational
efforts. Secondly, because of the macroscopic nature of the expansion rates,
they might be more appropriate to treat data coming from complex systems that
contain large amounts of noise, data that have points that are not
(arbitrarily) close as formally required for a proper calculation of the
Lyapunov exponents. Thirdly, expansion rates can be well defined for data sets
containing very few data points: the fewer points a data set contains, the
larger the regions of size $\epsilon$ need to be and the shorter the time T
is. Finally, expansion rates are defined in a similar way to finite-time
Lyapunov exponents and thus some algorithms used to calculate Lyapunov
exponents can be used to calculate our defined expansion rates.
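
One possible way to estimate the order-1 expansion rate of Eq. (17) from a sampled two-dimensional trajectory is sketched below. It is an illustration under several assumptions that are not taken from the text: the trajectory is sampled at unit time steps, the horizon `t` is an integer number of steps, boxes holding fewer than two points are skipped, and the function name `expansion_rate_e1` is hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def expansion_rate_e1(traj, epsilon, t):
    """Order-1 expansion rate, Eq. (17) with k = 1, from a 2D trajectory.

    traj: list of (x, y) points at consecutive time steps.
    epsilon: side of the square boxes; t: integer time horizon in steps.
    """
    # Group the indices of points (that still have an image t steps later) by box.
    boxes = defaultdict(list)
    for n in range(len(traj) - t):
        x, y = traj[n]
        boxes[(int(x / epsilon), int(y / epsilon))].append(n)

    def largest_distance(indices, shift):
        # Largest pairwise distance among the points traj[n + shift].
        return max(math.dist(traj[a + shift], traj[b + shift])
                   for a, b in combinations(indices, 2))

    rates = []
    for indices in boxes.values():
        if len(indices) < 2:
            continue  # a single point has no neighbour to separate from
        delta = largest_distance(indices, 0)      # initial spread inside the box
        big_delta = largest_distance(indices, t)  # spread after time t
        if delta > 0.0 and big_delta > 0.0:
            rates.append(math.log(big_delta / delta) / t)
    if not rates:
        raise ValueError("no box holds two distinct points")
    return sum(rates) / len(rates)
```

The pairwise search is quadratic in the number of points per box, which is acceptable for a sketch but would need a smarter neighbour search for long time series.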



IV. APPLICATIONS 

A. MIR and its bounds in two coupled chaotic 
maps 

To illustrate the use of our bounds, we consider the following two
bidirectionally coupled maps

$$\begin{aligned}
X_{n+1}^{(1)} &= 2X_n^{(1)} + \rho \left(X_n^{(1)}\right)^2 + \sigma\left(X_n^{(2)} - X_n^{(1)}\right), \mod 1 \\
X_{n+1}^{(2)} &= 2X_n^{(2)} + \rho \left(X_n^{(2)}\right)^2 + \sigma\left(X_n^{(1)} - X_n^{(2)}\right), \mod 1
\end{aligned} \tag{18}$$

where $X_n^{(i)} \in [0, 1]$. If $\rho = 0$, the map is piecewise-linear, and
quadratic otherwise. We are interested in measuring the exchange of
information between $X^{(1)}$ and $X^{(2)}$. The space $\Omega$ is a square of
sides 1. The Lyapunov exponents measured in the space $\Omega$ are the
Lyapunov exponents of the set $\Sigma_\Omega$ that is the chaotic attractor
generated by Eqs. (18).
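
A direct iteration of Eq. (18) can be sketched as follows. The function name `coupled_maps` and its argument order are hypothetical. Note, as a caveat of floating-point arithmetic rather than of the model, that in double precision the pure doubling map loses one bit per step, so uncoupled or exactly synchronised orbits with $\rho = 0$ collapse onto 0 after a few dozen iterations.

```python
def coupled_maps(x1, x2, rho, sigma, n_steps):
    """Iterate Eq. (18) and return the list of (x1, x2) states visited."""
    traj = []
    for _ in range(n_steps):
        traj.append((x1, x2))
        # Both updates use the old values, so they are computed together.
        x1, x2 = (
            (2.0 * x1 + rho * x1 ** 2 + sigma * (x2 - x1)) % 1.0,
            (2.0 * x2 + rho * x2 ** 2 + sigma * (x1 - x2)) % 1.0,
        )
    return traj
```

The returned list of pairs can be passed, after splitting into two series, to any grid-based estimator of the probabilities P(i), P(j), and P(i,j).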

The quantities $I_S/T$, $I_C$, and $I_C^l$ are shown in Fig. 1 as we vary
$\sigma$ for $\rho = 0$ (A) and $\rho = 0.1$ (B). We calculate $I_S$ using in
Eq. (1) the probabilities P(i,j) with which points from a trajectory composed
of 2,000,000 samples fall in boxes of sides $\epsilon = 1/500$, and the
probabilities P(i) and P(j) that the points visit the intervals
$[(i-1)\epsilon, i\epsilon[$ of the variable $X_n^{(1)}$ or
$[(j-1)\epsilon, j\epsilon[$ of the variable $X_n^{(2)}$, respectively, for
$i, j = 1, \ldots, N$. When computing $I_S/T$, the quantity T was estimated by
Eq. (4). Indeed, for most values of $\sigma$, $I_C \geq I_S/T$ and
$I_C^l \leq I_S/T$.

For $\sigma = 0$, there is no coupling, and therefore the two maps are
independent of each other. There is no information being exchanged. In fact,
$I_C = 0$ and $I_C^l = 0$ in both figures, since $D = D_0 = 2$, meaning that
the attractor $\Sigma_\Omega$ fully occupies the space $\Omega$. This is a
remarkable property of our bounds: to identify that there is no information
being exchanged when the two maps are independent. Complete synchronisation is
achieved and $I_C$ is maximal for $\sigma > 0.5$ (A) and for $\sigma > 0.55$
(B). This is a consequence of the fact that $D = D_0 = 1$, and therefore
$I_C = I_C^l = \lambda_1$. The reason is that in this situation the coupled
system is simply the shift map, a map with constant natural measure; therefore
P(i) = P(j) and P(i,j) are constant for all i and j. As usually happens when
one estimates the mutual information by partitioning the phase space with a
grid having a finite resolution and data sets possessing a finite number of
points, $I_S$ is typically larger than zero, even when there is no information
being exchanged ($\sigma = 0$). Even when there is complete synchronisation,
we find non-zero off-diagonal terms in the matrix for the joint probabilities,
causing $I_S$ to be smaller than it should be. Due to numerical errors,
$X^{(1)} \cong X^{(2)}$, and points that should be occupying boxes with two
corners exactly along a diagonal line in the subspace $\Omega$ end up
occupying boxes located off-diagonal and that have at least three corners
off-diagonal. The estimation of the lower bound $I_C^l$ suffers from the same
problems.

FIG. 1: [Color online] Results for two coupled maps. $I_S/T$ [Eq. (11)] as
(green online) filled circles, $I_C$ [Eq. (13)] as the (red online) thick
line, and $I_C^l$ [Eq. (16)] as the (brown online) crosses. In (A) $\rho = 0$
and in (B) $\rho = 0.1$. The units of $I_S/T$, $I_C$, and $I_C^l$ are
[bits/iteration].

Our upper bound $I_C$ is calculated assuming that there is a fictitious
dynamics expanding points (and producing probabilities) not only exponentially
fast but also uniformly. The "experimental" numerical points from Eqs. (18)
expand exponentially fast, but not uniformly. Most of the time the trajectory
remains in 4 points: (0,0), (1,1), (1,0), (0,1). That is the main reason why
$I_C$ is much larger than the estimated real value of the MIR for some
coupling strengths. If two nodes in a dynamical network, such as two neurons
in a brain, behave in the same way the fictitious dynamics does, these nodes
would be able to exchange the largest possible amount of information.

We would like to point out that one of the main advantages of calculating
upper bounds for the MIR ($I_S/T$) using Eq. (5) instead of actually
calculating $I_S/T$ is that we can reproduce the curves for $I_C$ using far
fewer points (1000 points) than the ones (2,000,000) used to calculate the
curve for $I_S/T$. If $\rho = 0$, $I_C = -\ln(1 - \sigma)$ can be calculated,
since $\lambda_1 = \ln(2)$ and $\lambda_2 = \ln(2 - 2\sigma)$.
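
The closed form for $\rho = 0$ can be recovered from the constant Jacobian of Eq. (18), whose eigenvalues are 2 and $2 - 2\sigma$. The sketch below is a hedged illustration of that calculation; the function name `upper_bound_linear_maps` is hypothetical, and the restriction to $0 \leq \sigma \leq 1$ is an assumption of this example.

```python
import math


def upper_bound_linear_maps(sigma):
    """I_C for Eq. (18) with rho = 0, from the eigenvalues of the constant Jacobian."""
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("this sketch assumes 0 <= sigma <= 1")
    # Jacobian at rho = 0: [[2 - sigma, sigma], [sigma, 2 - sigma]].
    a, b = 2.0 - sigma, sigma
    eig_sync = a + b     # eigenvector (1, 1): always equal to 2
    eig_trans = a - b    # eigenvector (1, -1): equal to 2 - 2 * sigma
    lambda_1 = math.log(eig_sync)
    # Exponents that are not positive are set to zero, as stated after Eq. (5).
    lambda_2 = max(math.log(abs(eig_trans)), 0.0) if eig_trans != 0.0 else 0.0
    return lambda_1 - lambda_2
```

For $\sigma < 0.5$ this reproduces $-\ln(1 - \sigma)$, and for $\sigma \geq 0.5$ the transversal exponent is no longer positive, so the bound saturates at $\ln 2$, in line with the maximal $I_C$ reported for the synchronised regime in panel (A).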



B. MIR and its bounds in experimental networks 
of Double-Scroll circuits 

We illustrate our approach for the treatment of data sets using a network
formed by an inductorless version of the Double-Scroll circuit [31]. We
consider four networks of bidirectionally diffusively coupled circuits.
Topology I represents two bidirectionally coupled circuits; Topology II, three
circuits coupled in an open-ended array; Topology III, four circuits coupled
in an open-ended array; and Topology IV, coupled in a closed array. We choose
two circuits in the different networks (one connection apart) and collect from
each circuit a time series of 79980 points, with a sampling rate of
S = 80,000 samples/s. The measured variable is the voltage across one of the
circuit capacitors, which is normalised in order to make the space $\Omega$ a
square of sides 1. Such normalisation does not alter the quantities that we
calculate. The following results provide the exchange of information between
these two chosen circuits. The values of $\epsilon$ and t used to coarse-grain
the space $\Omega$ and to calculate $e_2$ in Eq. (17) are the ones that
minimise $|N_C(T, e_2) - \tilde{N}_C(\epsilon)|$ and at the same time satisfy
$N_C(T, e_2) \geq \tilde{N}_C(\epsilon)$, where
$N_C(T, e_2) = \exp(T e_2(t))$ represents the number of fictitious boxes
covering the set $\Sigma_\Omega$ in a compact fashion, when t = T. This
optimisation excludes some non-significant points that make the expansion rate
of fictitious points much larger than it should be. In other words, we require
that $e_2$ describes well the way most of the points spread. We consider that
t used to calculate $e_k$ in Eq. (17) is the time for points initially in an
$\epsilon$-side box to spread to 0.8L. That guarantees that nearby points in
$\Sigma_\Omega$ are expanding in both directions within the time interval
[0, T]. Using $0.4L < t < 0.8L$ already produces similar results. If
$t > 0.8L$, the set $\Sigma_\Omega$ might not be only expanding, and T might
be overestimated.

$I_S$ has been estimated by the method in Ref. [32]. Since we assume that the
space $\Omega$ where mutual information is being measured is 2D, we compare
our results by considering in the method of Ref. [32] a 2D space formed by the
two collected scalar signals. In the method of Ref. [13] the phase space is
partitioned in regions that contain 30 points of the continuous trajectory.
Since these regions do not have equal areas (as is done to calculate $I_C$ and
$I_C^l$), in order to estimate $T$ we need to imagine a box of sides
$\epsilon_k$ such that its area $\epsilon_k^2$ contains on average 30 points.
The area occupied by the set $\Sigma_\Omega$ is approximately given by
$\epsilon^2 N_C$, where $N_C$ is the number of occupied boxes. Assuming that
the 79980 experimental data points occupy the space $\Omega$ uniformly, then
on average 30 points would occupy an area of
$\frac{30}{79980} \epsilon^2 N_C$. The square root of this area is the side of
the imaginary box that would contain 30 points. So,
$\epsilon_k = \sqrt{\frac{30}{79980} N_C}\,\epsilon$. Then, in the following,
the "exact" value of the MIR will be considered to be given by $I_S/T_k$,
where $T_k$ is estimated by $T_k = -\frac{1}{\lambda_1} \log(\epsilon_k)$.

FIG. 2: [Color online] Results for experimental networks of Double-Scroll
circuits. The pictograms in the upper-left corner represent how the circuits
(filled circles) are bidirectionally coupled. $I_S/T_k$ as (green online)
filled circles, $I_C$ as the (red online) thick line, and $I_C^l$ as the
(brown online) squares, for a varying coupling resistance $R$. The unit of the
quantities shown in these figures is kbits/s. (A) Topology I, (B) Topology II,
(C) Topology III, and (D) Topology IV. In all figures, $\tilde{D}_0$ increases
smoothly from 1.25 to 1.95 as $R$ varies from 0.1 k$\Omega$ to 5 k$\Omega$.
The line on the top of the figure represents the interval of resistance values
responsible for inducing almost synchronisation (AS) and phase synchronisation
(PS).
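The estimate of $\epsilon_k$ and $T_k$ described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the values of `eps`, `n_occupied` and `lyap1` are placeholders rather than measured quantities. Only the 79980 data points and the 30 points per region come from the text.

```python
import math

def box_side_for_k_points(eps, n_occupied, n_points, k=30):
    """Side eps_k of a square box expected to hold k points, assuming the
    n_points samples are spread uniformly over n_occupied boxes of side eps."""
    occupied_area = eps ** 2 * n_occupied          # area covered by the set
    return math.sqrt(k / n_points * occupied_area)

def time_scale(eps_k, lyap1):
    """T_k = -(1/lambda_1) * log(eps_k)."""
    return -math.log(eps_k) / lyap1

# Placeholder inputs (assumptions, not values from the experiment).
eps = 0.01          # side of the boxes used for the coarse-graining
n_occupied = 2000   # number of occupied boxes N_C
lyap1 = 1.0         # largest Lyapunov exponent (or order-1 expansion rate)

eps_k = box_side_for_k_points(eps, n_occupied, n_points=79980)
t_k = time_scale(eps_k, lyap1)
print(eps_k, t_k)
```

Dividing an estimate of $I_S$ by `t_k` would then give the reference value $I_S/T_k$ used for comparison with the bounds.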

The three main characteristics of the curves for the quantities $I_S/T_k$,
$I_C$, and $I_C^l$ (appearing in Fig. 2) with respect to the coupling strength
are that (i) as the coupling resistance becomes smaller, the coupling strength
connecting the circuits becomes larger and the level of synchronisation
increases, followed by an increase in $I_S/T_k$, $I_C$, and $I_C^l$; (ii) all
curves are close; and (iii) as expected, for most of the resistance values,
$I_C \geq I_S/T_k$ and $I_C^l \leq I_S/T_k$. The two main synchronous
phenomena appearing in these networks are almost synchronisation (AS) [33],
when the circuits are almost completely synchronous, and phase synchronisation
(PS) [34]. For the circuits considered in Fig. 2, AS appears for the interval
$R \in [0, 3]$ and PS appears for the interval $R \in [3, 3.5]$. Within this
region of resistance values the exchange of information between the circuits
becomes large. PS was detected by using the technique from Refs. [35, 36].



C. MIR and its upper bound in stochastic systems 

To demonstrate analytically that the quantities $I_C$ and $I_S/T$ can be well
calculated in stochastic systems, we consider the following stochastic
dynamical toy model, illustrated in Fig. 3. In it, points within a small box
of sides $\epsilon$ (represented by the filled square in Fig. 3(A)) located in
the centre of the subspace $\Omega$ are mapped after one iteration of the
dynamics to 12 other neighbouring boxes. Some points remain in the initial
box. The points that leave the initial box go to 4 boxes along the diagonal
line and 8 boxes off-diagonal along the transverse direction. Boxes along the
diagonal are represented by the filled squares in Fig. 3(B) and off-diagonal
boxes by filled circles. At the second iteration, the points occupy other
neighbouring boxes, as illustrated in Fig. 3(C), and at the time $n = T$ the
points do not spread any longer, but are somehow reinjected inside the region
of the attractor. We consider that this system is completely stochastic, in
the sense that no one can precisely determine where an initial condition will
be mapped. The only information is that points inside a smaller region are
mapped to a larger region.

At the iteration $n$, there will be $N_d = 2^{1+n} + 1$ boxes occupied along
the diagonal (filled squares in Fig. 3) and $N_t = 2nN_d - C(\tilde{n})$ boxes
occupied off-diagonal along the transverse direction (filled circles in
Fig. 3), where $C(\tilde{n}) = 0$ for $\tilde{n} = 0$, $C(\tilde{n}) > 0$ for
$\tilde{n} \geq 1$, and $\tilde{n} = n - T - \alpha$. Here $\alpha$ is a small
number of iterations representing the difference between the time $T$ for the
points in the diagonal to reach the boundary of the space $\Omega$ and the
time for the points in the off-diagonal to reach this boundary. The border
effect can be ignored when the expansion along the diagonal direction is much
faster than along the transverse direction.

At the iteration $n$, there will be
$N_C = 2^{1+n} + 1 + (2^{1+n} + 1)2n - C(\tilde{n})$ boxes occupied by points.
In the following calculations we consider that $N_C \cong 2^{1+n}(1 + 2n)$.
We assume that the subspace $\Omega$ is a square whose sides have length 1,
and that $\Sigma \subset \Omega$, so $L = \sqrt{2}$. For $n > T$, the
attractor does not grow any longer along the off-diagonal direction. The time
$n = T$ for the points to spread over the attractor $\Sigma$ can be calculated
from the time it takes for points to visit all the boxes along the diagonal.
Thus, we need to satisfy $N_d \epsilon \sqrt{2} = \sqrt{2}$. Ignoring the 1
appearing in the expression for $N_d$ due to the initial box in the estimation
of the value of $T$, we arrive at
$T > \frac{\log(1/\epsilon)}{\log(2)} - 1$. This stochastic system is
discrete. In order to take into consideration the initial box in the
calculation of $T$, we pick the first integer that is larger than
$\frac{\log(1/\epsilon)}{\log(2)} - 1$, leading $T$ to be the largest integer
that satisfies

$$T < -\frac{\log(\epsilon)}{\log(2)}. \qquad (19)$$

FIG. 3: (A) A small box representing a set of initial conditions. After one
iteration of the system, the points that leave the initial box in (A) go to 4
boxes along the diagonal line [filled squares in (B)] and 8 boxes off-diagonal
(along the transverse direction) [filled circles in (B)]. At the second
iteration, the points occupy other neighbouring boxes as illustrated in (C),
and after an interval of time $n = T$ the points do not spread any longer (D).
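The box counts and the spreading time of the toy model can be checked numerically with a minimal sketch. The function names and the value of `eps` are assumptions made for illustration, and the border correction $C(\tilde{n})$ is neglected, as in the approximations used in the text.

```python
import math

def diagonal_boxes(n):
    """N_d = 2^(1+n) + 1 boxes occupied along the diagonal at iteration n."""
    return 2 ** (1 + n) + 1

def occupied_boxes(n):
    """N_C = N_d + 2 n N_d, with the border correction C neglected."""
    return diagonal_boxes(n) * (1 + 2 * n)

def spreading_time(eps):
    """Largest integer T that satisfies T < -log(eps)/log(2), as in Eq. (19)."""
    bound = -math.log2(eps)
    return math.ceil(bound) - 1

eps = 1e-3                      # hypothetical box side
T = spreading_time(eps)
print(T, diagonal_boxes(T) * eps)   # N_d * eps should be close to 1 at n = T
```

With this choice of `eps`, the diagonal boxes at $n = T$ cover a length close to 1 in units of the box side, which is the condition used to define $T$.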



The largest Lyapunov exponent, or the order-1 expansion rate, of this
stochastic toy model can be calculated from
$N_d(n) \exp(\lambda_1) = N_d(n+1)$, which takes us to

$$\lambda_1 = \log(2). \qquad (20)$$

Therefore, Eq. (19) can be rewritten as $T = -\frac{\log(\epsilon)}{\lambda_1}$.

The quantity $D$ can be calculated by
$D = -\frac{\log(N_C(n))}{\log(\epsilon)}$, with $n = T$. Neglecting
$C(\tilde{n})$ and the 1 appearing in $N_C$ due to the initial box, we have
that $N_C \cong 2^{1+T}[1 + 2T]$. Substituting in the definition of $D$, we
obtain $D = -\frac{(1+T)\log(2) + \log(1 + 2T)}{\log(\epsilon)}$. Using $T$
from Eq. (19), we arrive at

$$D = 1 + r, \qquad (21)$$

where

$$r = -\frac{\log(2)}{\log(\epsilon)} - \frac{\log(1 + 2T)}{\log(\epsilon)}. \qquad (22)$$

Placing $D$ and $\lambda_1$ in $I_C = \lambda_1(2 - D)$ gives us

$$I_C = \log(2)(1 - r). \qquad (23)$$



Let us now calculate $I_S/T$. Ignoring the border effect, and assuming that
the expansion of points is uniform, then $P(i,j) = 1/N_C$ and
$P(i) = P(j) = 1/N = \epsilon$. At the iteration $n = T$, we have that
$I_S = -2\log(\epsilon) - \log(N_C)$. Since $N_C \cong 2^{1+T}[1 + 2T]$, we
can write that $I_S = -2\log(\epsilon) - (1 + T)\log(2) - \log(1 + 2T)$.
Placing $T$ from Eq. (19) into $I_S$ takes us to
$I_S = -\log(2) - \log(\epsilon) - \log(1 + 2T)$. Finally, dividing $I_S$ by
$T$, we arrive at

$$\frac{I_S}{T} = \log(2)\left[1 + \frac{\log(2)}{\log(\epsilon)} + \frac{\log(1 + 2T)}{\log(\epsilon)}\right] = \log(2)(1 - r). \qquad (24)$$



As expected from the way we have constructed this model, Eqs. (24) and (23)
are equal, and $I_C = \frac{I_S}{T}$.
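The equality between $I_C$ and $I_S/T$ in the toy model can be verified numerically. The sketch below is illustrative only: the function name and the value of `eps` are assumptions, $T$ is taken as the real number $-\log(\epsilon)/\lambda_1$ rather than an integer, and the border effect is neglected, as in the derivation.

```python
import math

def toy_model_rates(eps):
    """Return (I_C, I_S/T, D) for the stochastic toy model without border effect."""
    lyap1 = math.log(2.0)                         # Eq. (20)
    T = -math.log(eps) / lyap1                    # Eq. (19) taken as an equality
    # log N_C at n = T, with N_C = 2^(1+T) * (1 + 2T)
    log_nc = (1 + T) * math.log(2.0) + math.log(1 + 2 * T)
    D = -log_nc / math.log(eps)                   # D = 1 + r
    i_c = lyap1 * (2 - D)                         # upper bound, Eq. (23)
    i_s = -2 * math.log(eps) - log_nc             # mutual information at n = T
    return i_c, i_s / T, D

i_c, mir, D = toy_model_rates(eps=2.0 ** -10)
print(i_c, mir, D)
```

For small but finite $\epsilon$ the value of `D` returned here is noticeably larger than 1, because the correction $r$ decays only logarithmically with $\epsilon$.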

Had we included the border effect in the calculation of $I_C$, denoting the
resulting value by $I_C^b$, we would typically have obtained that
$I_C^b \geq I_C$, since $\lambda_2$ calculated considering a finite space
$\Omega$ would be smaller than or equal to the value obtained by neglecting
the border effect. Had we included the border effect in the calculation of
$I_S/T$, denoting the resulting value by $I_S^b/T$, we would typically expect
that the probabilities $P(i,j)$ would not be constant. That is because the
points that leave the subspace $\Omega$ would be randomly reinjected back to
$\Omega$. We would conclude that $I_S^b/T \leq I_S/T$. Therefore, had we
included the border effect, we would have obtained that $I_C^b \geq I_S^b/T$.

The way we have constructed this stochastic toy model results in
$D \cong 1$. This is because the spreading of points along the diagonal
direction is much faster than the spreading of points along the off-diagonal
transverse direction. In other words, the second largest Lyapunov exponent,
$\lambda_2$, is close to zero. To obtain stochastic toy models that produce a
larger $\lambda_2$, one could consider that the spreading along the transverse
direction is given by $N_t = N_d 2^{\alpha n} - C(\tilde{n})$, with
$\alpha \in [0, 1]$.



D. Expansion rates for noisy data with few 
sampling points 



In terms of the order-1 expansion rate, $e_1$, our quantities read
$I_C = e_1(2 - D)$, $T = \frac{1}{e_1} \log\left[\frac{1}{\epsilon}\right]$,
and $I_C^l = e_1(2 - \tilde{D}_0)$. In order to show that our expansion rate
can be used to calculate these quantities, we consider that the experimental
system is uni-dimensional and has a constant probability measure. Additive
noise is assumed to be bounded with maximal amplitude $\eta$ and to have
constant density.

Our order-1 expansion rate is defined as

$$e_1(t) = \frac{1}{N_C} \sum_{i=1}^{N_C} \frac{1}{t} \log[L_1^i(t)], \qquad (25)$$

where $L_1^i(t)$ measures the largest growth rate of nearby points. Since all
that matters is the largest distance between points, it can be estimated even
when the experimental data set has very few data points. Since, in this
example, we consider that the experimental noisy points have a constant
uniform probability distribution, $e_1(t)$ can be calculated by

$$e_1(t) = \frac{1}{t} \log\left[\frac{\Delta + 2\eta}{\delta + 2\eta}\right], \qquad (26)$$

where $\delta + 2\eta$ represents the largest distance between pairs of
experimental noisy points in an $\epsilon$-square box and $\Delta + 2\eta$
represents the largest distance between pairs of the points that were
initially in the $\epsilon$-square box but have spread out for an interval of
time $t$. The experimental system (without noise) is responsible for making
points that are at most $\delta$ apart from each other spread to at most
$\Delta$ apart from each other. These points spread out exponentially fast
according to the largest positive Lyapunov exponent $\lambda_1$ by

$$\Delta = \delta \exp(\lambda_1 t). \qquad (27)$$

Substituting Eq. (27) in (26), and expanding log to first order, we obtain
that $e_1 \approx \lambda_1$, and therefore our expansion rate can be used to
estimate Lyapunov exponents.
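The effect of bounded noise on the expansion rate of Eq. (26) can be illustrated with a short numerical sketch. The function name and all parameter values below are assumptions chosen only for illustration.

```python
import math

def noisy_expansion_rate(delta, eta, lyap1, t):
    """Eq. (26) with Delta taken from Eq. (27):
    e_1(t) = (1/t) * log[(Delta + 2*eta) / (delta + 2*eta)]."""
    big_delta = delta * math.exp(lyap1 * t)       # noiseless spreading, Eq. (27)
    return math.log((big_delta + 2 * eta) / (delta + 2 * eta)) / t

lyap1 = math.log(2.0)   # hypothetical largest Lyapunov exponent
for eta in (0.0, 1e-4, 1e-3):
    e1 = noisy_expansion_rate(delta=1e-2, eta=eta, lyap1=lyap1, t=5.0)
    print(f"eta={eta:g}  e1={e1:.4f}  lambda1={lyap1:.4f}")
```

In this sketch the noiseless case recovers $\lambda_1$, and increasing $\eta$ relative to $\delta$ lowers the estimate, so the approximation $e_1 \approx \lambda_1$ is best when the noise amplitude is small compared with the initial separation.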



V. SUPPLEMENTARY INFORMATION 

A. Decay of correlation and First Poincare Returns 

As rigorously shown in [40], the decay with time of the correlation, $C(t)$,
is proportional to the decay with time of the density of the first Poincaré
recurrences, $\rho(t, \epsilon)$, which measures the probability with which a
trajectory returns to an $\epsilon$-interval after $t$ iterations. Therefore,
if $\rho(t, \epsilon)$ decays with $t$, for example exponentially fast, $C(t)$
will decay with $t$ exponentially fast as well. The relationship between
$C(t)$ and $\rho(t)$ can be simply understood in chaotic systems with one
expanding direction (one positive Lyapunov exponent). As shown in [41], the
"local" decay of correlation (measured in the $\epsilon$-interval) is given by
$C(t, \epsilon) \leq \mu(\epsilon)\rho(t, \epsilon) - \mu(\epsilon)^2$, where
$\mu(\epsilon)$ is the probability measure of a chaotic trajectory to visit
the $\epsilon$-interval. Consider the shift map $x_{n+1} = 2x_n \mod 1$. For
this map, $\mu(\epsilon) = \epsilon$ and there are an infinite number of
possible intervals that make $C(t, \epsilon) = 0$ for a finite $t$. These
intervals are the cells of a Markov partition. As recently demonstrated by
[P. Pinto, I. Labouriau, M. S. Baptista], in piecewise-linear systems such as
the shift map, if $\epsilon$ is a cell in an order-$t$ Markov partition and
$\rho(t, \epsilon) > 0$, then $\rho(t, \epsilon) = 2^{-t}$, and by the way a
Markov partition is constructed we have that $\epsilon = 2^{-t}$. Since
$\epsilon = \mu(\epsilon) = 2^{-t}$, we arrive at $C(t, \epsilon) \leq 0$ for
a special finite time $t$. Notice that $\epsilon = 2^{-t}$ can be rewritten as
$-\ln(\epsilon) = t \ln(2)$. Since for this map the largest Lyapunov exponent
is equal to $\lambda_1 = \ln(2)$, then $t = -\frac{1}{\lambda_1}\ln(\epsilon)$,
which is exactly equal to the quantity $T$, the time interval responsible for
making the system lose its memory of the initial condition, and which can be
calculated as the time it takes for points inside an initial
$\epsilon$-interval to spread over the whole phase space, in this case
$[0, 1]$.
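The statement that points in a cell of length $\epsilon = 2^{-t}$ spread over $[0, 1]$ in $t = -\ln(\epsilon)/\lambda_1$ iterations of the shift map can be checked with exact rational arithmetic. This is a minimal sketch: the grid resolution and the value of `t` are arbitrary choices made for illustration.

```python
from fractions import Fraction
import math

def shift_map(x):
    """One iteration of x -> 2x mod 1 (exact when x is a Fraction)."""
    return (2 * x) % 1

t = 6
eps = Fraction(1, 2 ** t)                   # length of one cell, eps = 2^-t
# Sample the cell [0, eps) on a dyadic grid of spacing 2^-12.
points = [Fraction(k, 2 ** 12) for k in range(2 ** 12 // 2 ** t)]
for _ in range(t):
    points = [shift_map(x) for x in points]

spread = max(points) - min(points)          # extent of the iterated cell
T = -math.log(float(eps)) / math.log(2)     # T = -ln(eps)/lambda_1, lambda_1 = ln 2
print(spread, T)
```

Fractions are used instead of floating-point numbers because repeated doubling modulo 1 discards one binary digit per iteration, which would otherwise degrade the sampled points.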



B. $I_C$ and $I_C^l$ in larger networks and
higher-dimensional subspaces $\Sigma_\Omega$

Imagine a network formed by $K$ coupled oscillators. Uncoupled, each
oscillator possesses a certain number of positive Lyapunov exponents, one
zero exponent, and the others are negative. Each oscillator has dimension $d$.
Assume that the only information available from the network are two
$Q$-dimensional measurements, or a scalar signal that is reconstructed to a
$Q$-dimensional embedding space. So, the subspace $\Sigma_\Omega$ has
dimension $2Q$, and each subspace of a node (or group of nodes) has dimension
$Q$. To be consistent with our previous equations, we assume that we measure
$M_\Omega = 2Q$ positive Lyapunov exponents on the projection $\Sigma_\Omega$.
If $M_\Omega \neq 2Q$, then in the following equations $2Q$ should be replaced
by $M_\Omega$, naturally assuming that $M_\Omega \leq 2Q$.

In analogy with the derivation of $I_C$ and $I_C^l$ in a bidimensional
projection, we assume that the spreading of initial conditions is uniform in
the subspace $\Omega$. Then, $P(i) = 1/N^Q$ represents the probability of
finding trajectory points in the $Q$-dimensional space of one node (or a group
of nodes) and $P(i,j) = 1/N_C$ represents the probability of finding
trajectory points in the $2Q$-dimensional composed subspace constructed by two
nodes (or two groups of nodes) in the subspace $\Omega$. Additionally, we
consider that the hypothetical number of occupied boxes $N_C$ will be given by
$N_C(T) = \exp\left[T \sum_{i=1}^{2Q} \lambda_i\right]$. Then, we have that
$T = \frac{1}{\lambda_1} \log\left(\frac{1}{\epsilon}\right)$, which leads us
to

$$I_C = \lambda_1(2Q - D). \qquad (28)$$

Similarly to the way we have derived $I_C^l$ in a bidimensional projection, if
$\Sigma_\Omega$ has more than 2 positive Lyapunov exponents, then

$$I_C^l = \lambda_1(2Q - \tilde{D}_0). \qquad (29)$$



To write Eq. (28) in terms of the positive Lyapunov exponents, we first extend
the calculation of the quantity $D$ to higher-dimensional subspaces that have
dimensionality $2Q$,

$$D = 1 + \sum_{i=2}^{2Q} \frac{\lambda_i}{\lambda_1}, \qquad (30)$$

where $\lambda_1 \geq \lambda_2 \geq \lambda_3 \ldots \geq \lambda_{2Q}$ are
the Lyapunov exponents measured on the subspace $\Omega$. To derive this
equation we only consider that the hypothetical number of occupied boxes $N_C$
is given by $N_C(T) = \exp\left[T \sum_{i=1}^{2Q} \lambda_i\right]$.

We then substitute $D$ as a function of these exponents (Eq. (30)) in
Eq. (28). We arrive at

$$I_C = (2Q - 1)\lambda_1 - \sum_{i=2}^{2Q} \lambda_i. \qquad (31)$$
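The equivalence between Eq. (28) and Eq. (31) can be checked for any set of positive exponents. The following sketch uses hypothetical function names and made-up exponent values; it only illustrates the algebra.

```python
import math

def dimension_D(lyaps):
    """Eq. (30): D = 1 + sum_{i=2}^{2Q} lambda_i / lambda_1."""
    lam = sorted(lyaps, reverse=True)
    return 1 + sum(lam[1:]) / lam[0]

def upper_bound_ic(lyaps):
    """Eq. (31): I_C = (2Q - 1) * lambda_1 - sum_{i=2}^{2Q} lambda_i,
    where 2Q is the number of exponents supplied."""
    lam = sorted(lyaps, reverse=True)
    return (len(lam) - 1) * lam[0] - sum(lam[1:])

lyaps = [0.9, 0.5, 0.3, 0.1]          # hypothetical exponents, 2Q = 4
D = dimension_D(lyaps)
ic_from_exponents = upper_bound_ic(lyaps)
ic_from_dimension = max(lyaps) * (len(lyaps) - D)   # Eq. (28)
print(D, ic_from_exponents, ic_from_dimension)
```

Both routes give the same number, which is what substituting Eq. (30) into Eq. (28) predicts.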






C. $\tilde{I}_C$ as a function of the positive Lyapunov
exponents of the network

Consider a network whose attractor $\Sigma$ possesses $M$ positive Lyapunov
exponents, denoted by $\tilde{\lambda}_i$, $i = 1, \ldots, M$. For a typical
subspace $\Omega$, $\lambda_1$ measured on $\Omega$ is equal to the largest
Lyapunov exponent of the network. Just for the sake of simplicity, assume that
the nodes in the network are sufficiently well connected so that in a typical
measurement with a finite number of observations this property holds, i.e.,
$\lambda_1 = \tilde{\lambda}_1$. But, if measurements provide that
$\tilde{\lambda}_1 \gg \lambda_1$, the next arguments apply as well, if one
replaces $\tilde{\lambda}_1$ appearing in the further calculations by the
smallest Lyapunov exponent of the network, say $\tilde{\lambda}_k$, that is
still larger than $\lambda_1$, and then substitutes $\tilde{\lambda}_2$ by
$\tilde{\lambda}_{k+1}$, and so on. As before, consider that $M_\Omega = 2Q$.

Then, for an arbitrary subspace $\Omega$,
$\sum_{i=2}^{2Q} \lambda_i \leq \sum_{i=2}^{2Q} \tilde{\lambda}_i$, since a
projection cannot make the Lyapunov exponents larger, but only smaller or
equal.

Defining

$$\tilde{I}_C = (2Q - 1)\tilde{\lambda}_1 - \sum_{i=2}^{2Q} \tilde{\lambda}_i, \qquad (32)$$

and since
$\sum_{i=2}^{2Q} \lambda_i \leq \sum_{i=2}^{2Q} \tilde{\lambda}_i$, it is easy
to see that

$$\tilde{I}_C \leq I_C. \qquad (33)$$

So, $I_C$, measured on the subspace $\Sigma_\Omega$ and a function of the $2Q$
largest positive Lyapunov exponents measured in $\Sigma_\Omega$, is an upper
bound for $\tilde{I}_C$, a quantity defined by the $2Q$ largest positive
Lyapunov exponents of the attractor $\Sigma$ of the network. Therefore, if the
Lyapunov exponents of a network are known, the quantity $\tilde{I}_C$ can be
used as a way to estimate how large the MIR is between two measurements of
this network, measurements that form the subspace $\Omega$.

Notice that $I_C$ depends on the projection chosen (the subspace $\Omega$) and
on its dimension, whereas $\tilde{I}_C$ depends on the dimension of the
subspace $\Sigma_\Omega$ (the number $2Q$ of positive Lyapunov exponents). The
same happens for the mutual information between random variables, which
depends on the projection considered.

Equation (32) is important because it allows us to obtain an estimation for
the value of $I_C$ analytically. As an example, imagine the following network
of coupled maps with a constant Jacobian

$$X_{n+1}^{(i)} = 2X_n^{(i)} + \sigma \sum_{j=1}^{K} A_{ij}\left(X_n^{(j)} - X_n^{(i)}\right), \mod 1, \qquad (34)$$

where $X \in [0, 1]$ and $A$ represents the adjacency matrix. If node $j$
connects to node $i$, then $A_{ij} = 1$, and 0 otherwise.

Assume that the nodes are connected all-to-all. Then, the $K$ positive
Lyapunov exponents of this network are $\tilde{\lambda}_1 = \log(2)$ and
$\tilde{\lambda}_i = \log 2[1 + \sigma]$, with $i = 2, \ldots, K$. Assume also
that the subspace $\Omega$ has dimension $2Q$, that $2Q$ positive Lyapunov
exponents are observed in this space, and that
$\lambda_1 = \tilde{\lambda}_1$. Substituting these Lyapunov exponents in
Eq. (32), we arrive at

$$\tilde{I}_C = (2Q - 1)\log(1 + \sigma). \qquad (35)$$

We conclude that there are two ways for $\tilde{I}_C$ to increase. Either one
considers larger measurable subspaces $\Omega$ or one increases the coupling
between the nodes. This suggests that the larger the coupling strength is, the
more information is exchanged between groups of nodes.

For arbitrary topologies, one can also derive analytical formulas for
$\tilde{I}_C$ in this network, since $\tilde{\lambda}_i$ for $i > 2$ can be
calculated from $\tilde{\lambda}_2$ [Hj]. One arrives at

Xi(aW2) = A 2 (a), (36) 

where uji is the ith largest eigenvalue (in absolute value) 
of the Laplacian matrix Ly = A^ + 1 A^- . 

VI. CONCLUSIONS 

Concluding, we have shown a procedure to calculate the mutual information rate
(MIR) between two nodes (or groups of nodes) in dynamical networks and data
sets that are either mixing, or present fast decay of correlations, or have
sensitivity to initial conditions, and have proposed significant upper ($I_C$)
and lower ($I_C^l$) bounds for it, in terms of the Lyapunov exponents, the
expansion rates, and the capacity dimension. Since our upper bound is
calculated from Lyapunov exponents or expansion rates, it can be used to
estimate the MIR between data sets that have different sampling rates or
experimental resolution (e.g. the rise of the ocean level and the average
temperature of the Earth), or between systems possessing a different number of
events. Additionally, Lyapunov exponents can be accurately calculated even
when data sets are corrupted by noise of large amplitude (observational
additive noise) [37, 38] or when the system generating the data suffers from
parameter alterations ("experimental drift") [39]. Our bounds link information
(the MIR) and the dynamical behaviour of the system being observed with
synchronisation, since the more synchronous two nodes are, the smaller
$\lambda_2$ and $\tilde{D}_0$ will be. This link can be of great help in
establishing whether two nodes in a dynamical network or in a complex system
not only exchange information but also have linear or non-linear
interdependences, since the approaches to measure the level of synchronisation
between two systems are reasonably well known and have been widely used. If
variables are synchronous in a time-lag fashion [34], it was shown in
Ref. [16] that the MIR is independent of the delay between the two processes.
The upper bound for the MIR could be calculated by measuring the Lyapunov
exponents of the network (see Supplementary Information), which are also
invariant to time-delays between the variables.